
<!DOCTYPE html>
<html lang="zh-CN">
<head>
<meta charset="UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
<title>唐松 Santos</title>
<meta name="description" content="The personal blog of 唐松 Santos, author of 《Python 网络爬虫: 从入门到实践》" />
<meta name="keywords" content="唐松 Santos, Python, web crawler, Python 网络爬虫: 从入门到实践, Python crawler, big data" />
<link rel="apple-touch-icon" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/images/icon_32.png">
<link rel="apple-touch-icon" sizes="152x152" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/images/icon_152.png">
<link rel="apple-touch-icon" sizes="167x167" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/images/icon_167.png">
<link rel="apple-touch-icon" sizes="180x180" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/images/icon_180.png">
<link rel="icon" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/images/icon_32.png" type="image/png">
<link rel="stylesheet" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/css/bootstrap.min.css">
<link rel="stylesheet" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/css/font-awesome.min.css">
<script type="text/javascript" src="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/js/jquery.min.js"></script>
<script type="text/javascript" src="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/js/bootstrap.min.js"></script>
<link rel="stylesheet" href="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/style.css">
<link rel="pingback" href="http://www.santostang.com/xmlrpc.php" />
<link rel="icon" href="http://www.santostang.com/wp-content/uploads/2018/07/cropped-IMG_0423-min-32x32.jpg" sizes="32x32" />
<link rel="icon" href="http://www.santostang.com/wp-content/uploads/2018/07/cropped-IMG_0423-min-192x192.jpg" sizes="192x192" />
<link rel="apple-touch-icon-precomposed" href="http://www.santostang.com/wp-content/uploads/2018/07/cropped-IMG_0423-min-180x180.jpg" />
<meta name="msapplication-TileImage" content="http://www.santostang.com/wp-content/uploads/2018/07/cropped-IMG_0423-min-270x270.jpg" />
<style type="text/css">
a{color:#1e73be}
a:hover{color:#2980b9!important}
#header{background-color:#1e73be}
.widget .widget-title::after{background-color:#1e73be}
.uptop{border-left-color:#1e73be}
#titleBar .toggle:before{background:#1e73be}
</style>
</head>

<body>
<header id="header">
    <div class="avatar"><a href="http://www.santostang.com" title="唐松Santos"><img src="http://www.santostang.com/wp-content/uploads/2018/07/IMG_0423-min.jpg" alt="唐松Santos" class="img-circle" width="50%"></a></div>
    <h1 id="name">唐松Santos</h1>
    <div class="sns">
                <a href="https://www.zhihu.com/people/santostang" target="_blank" rel="nofollow" title="Zhihu"><i class="fa fa-weibo" aria-hidden="true"></i></a>        <a href="https://www.linkedin.com/in/santostang" target="_blank" rel="nofollow" title="LinkedIn"><i class="fa fa-linkedin" aria-hidden="true"></i></a>                <a href="https://github.com/Santostang" target="_blank" rel="nofollow" title="GitHub"><i class="fa fa-github" aria-hidden="true"></i></a>    </div>
    <div class="nav">
        <ul><li><a href="http://www.santostang.com/">Home</a></li>
<li><a href="http://www.santostang.com/sample-page/">About Me</a></li>
<li><a href="http://www.santostang.com/python%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab%e4%bb%a3%e7%a0%81/">Crawler Book Code</a></li>
<li><a href="http://www.santostang.com/%e5%8a%a0%e6%88%91%e5%be%ae%e4%bf%a1/">Add Me on WeChat</a></li>
<li><a href="https://santostang.github.io/">English Site</a></li>
</ul>    </div>
        <div class="weixin">
        <img src="http://www.santostang.com/wp-content/uploads/2018/07/qrcode_for_gh_370f70791e19_344-1.jpg" alt="WeChat Official Account" width="50%">
        <p>WeChat Official Account</p>
    </div>
    </header>

<div id="main">
    <div class="row box">
        <div class="col-md-8">
                                <article class="article-list-1 clearfix">
                <header class="clearfix">
                    <h1 class="post-title"><a href="http://www.santostang.com/2018/07/15/4-3-%e9%80%9a%e8%bf%87selenium-%e6%a8%a1%e6%8b%9f%e6%b5%8f%e8%a7%88%e5%99%a8%e6%8a%93%e5%8f%96/">4.3 Scraping by Simulating a Browser with Selenium</a></h1>
                    <div class="post-meta">
                        <span class="meta-span"><i class="fa fa-calendar"></i> Jul 15</span>
                        <span class="meta-span"><i class="fa fa-folder-open-o"></i> <a href="http://www.santostang.com/category/uncategorized/" rel="category tag">Uncategorized</a></span>
                        <span class="meta-span"><i class="fa fa-commenting-o"></i> <a href="http://www.santostang.com/2018/07/15/4-3-%e9%80%9a%e8%bf%87selenium-%e6%a8%a1%e6%8b%9f%e6%b5%8f%e8%a7%88%e5%99%a8%e6%8a%93%e5%8f%96/#respond">No comments</a></span>
                        <span class="meta-span hidden-xs"><i class="fa fa-tags" aria-hidden="true"></i> </span>
                    </div>
                </header>
                <div class="post-content clearfix">
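                    <p>4.3 Scraping by Simulating a Browser with Selenium

In the examples above, it was easy to find the source address with Chrome's "Inspect" feature. Some websites are far more complex, however: for the Tmall product reviews mentioned earlier, "Inspect" makes it very hard to find the URL being requested. In addition, some data...</p>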
                </div>
            </article>
                                                <article class="article-list-1 clearfix">
                <header class="clearfix">
                    <h1 class="post-title"><a href="http://www.santostang.com/2018/07/14/4-2-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80%e6%8a%93%e5%8f%96/">4.2 Scraping by Resolving the Real Address</a></h1>
                    <div class="post-meta">
                        <span class="meta-span"><i class="fa fa-calendar"></i> Jul 14</span>
                        <span class="meta-span"><i class="fa fa-folder-open-o"></i> <a href="http://www.santostang.com/category/python-%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="category tag">Python Web Crawling</a></span>
                        <span class="meta-span"><i class="fa fa-commenting-o"></i> <a href="http://www.santostang.com/2018/07/14/4-2-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80%e6%8a%93%e5%8f%96/#respond">No comments</a></span>
                        <span class="meta-span hidden-xs"><i class="fa fa-tags" aria-hidden="true"></i> <a href="http://www.santostang.com/tag/ajax/" rel="tag">ajax</a>,<a href="http://www.santostang.com/tag/python/" rel="tag">python</a>,<a href="http://www.santostang.com/tag/%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="tag">web crawler</a>,<a href="http://www.santostang.com/tag/%e7%bd%91%e9%a1%b5%e7%88%ac%e8%99%ab/" rel="tag">web page crawler</a>,<a href="http://www.santostang.com/tag/%e8%a7%a3%e6%9e%90%e5%9c%b0%e5%9d%80/" rel="tag">address resolution</a></span>
                    </div>
                </header>
                <div class="post-content clearfix">
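                    <p>Because the 网易云跟帖 (NetEase Cloud comment service) has shut down, the newly rewritten Chapter 4 has been posted here. Please refer to the article:

4.2 Scraping by Resolving the Real Address

Even though the data does not appear in the page source, we can still find the data's real address, and requesting that address returns the data we want...</p>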
                </div>
            </article>
                                                <article class="article-list-1 clearfix">
                <header class="clearfix">
                    <h1 class="post-title"><a href="http://www.santostang.com/2018/07/14/%e7%ac%ac%e5%9b%9b%e7%ab%a0%ef%bc%9a%e5%8a%a8%e6%80%81%e7%bd%91%e9%a1%b5%e6%8a%93%e5%8f%96-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80-selenium/">Chapter 4: Dynamic Web Page Scraping (Resolving the Real Address + Selenium)</a></h1>
                    <div class="post-meta">
                        <span class="meta-span"><i class="fa fa-calendar"></i> Jul 14</span>
                        <span class="meta-span"><i class="fa fa-folder-open-o"></i> <a href="http://www.santostang.com/category/python-%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="category tag">Python Web Crawling</a></span>
                        <span class="meta-span"><i class="fa fa-commenting-o"></i> <a href="http://www.santostang.com/2018/07/14/%e7%ac%ac%e5%9b%9b%e7%ab%a0%ef%bc%9a%e5%8a%a8%e6%80%81%e7%bd%91%e9%a1%b5%e6%8a%93%e5%8f%96-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80-selenium/#respond">No comments</a></span>
                        <span class="meta-span hidden-xs"><i class="fa fa-tags" aria-hidden="true"></i> <a href="http://www.santostang.com/tag/ajax/" rel="tag">ajax</a>,<a href="http://www.santostang.com/tag/javascript/" rel="tag">javascript</a>,<a href="http://www.santostang.com/tag/python/" rel="tag">python</a>,<a href="http://www.santostang.com/tag/sselenium/" rel="tag">selenium</a>,<a href="http://www.santostang.com/tag/%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="tag">web crawler</a></span>
                    </div>
                </header>
                <div class="post-content clearfix">
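                    <p>Because the 网易云跟帖 (NetEase Cloud comment service) has shut down, the newly rewritten Chapter 4 has been posted here. Please refer to the article:

The pages scraped so far were all static pages, whose displayed content is entirely contained in the HTML source. However, because mainstream websites use JavaScript to present page content...</p>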
                </div>
            </article>
                                                <article class="article-list-1 clearfix">
                <header class="clearfix">
                    <h1 class="post-title"><a href="http://www.santostang.com/2018/07/11/%e3%80%8a%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab%ef%bc%9a%e4%bb%8e%e5%85%a5%e9%97%a8%e5%88%b0%e5%ae%9e%e8%b7%b5%e3%80%8b%e4%b8%80%e4%b9%a6%e5%8b%98%e8%af%af/">Errata for 《网络爬虫：从入门到实践》</a></h1>
                    <div class="post-meta">
                        <span class="meta-span"><i class="fa fa-calendar"></i> Jul 11</span>
                        <span class="meta-span"><i class="fa fa-folder-open-o"></i> <a href="http://www.santostang.com/category/python-%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="category tag">Python Web Crawling</a></span>
                        <span class="meta-span"><i class="fa fa-commenting-o"></i> <a href="http://www.santostang.com/2018/07/11/%e3%80%8a%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab%ef%bc%9a%e4%bb%8e%e5%85%a5%e9%97%a8%e5%88%b0%e5%ae%9e%e8%b7%b5%e3%80%8b%e4%b8%80%e4%b9%a6%e5%8b%98%e8%af%af/#respond">No comments</a></span>
                        <span class="meta-span hidden-xs"><i class="fa fa-tags" aria-hidden="true"></i> </span>
                    </div>
                </header>
                <div class="post-content clearfix">
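                    <p>As this is the first edition of the book, it still contains some errors, and I ask for readers' understanding.

I also thank readers for their corrections. The book's errors are now collected here so that other readers can read and use the book more easily. You are also welcome to send me a private message on Zhihu or leave a comment, and I will keep updating this...</p>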
                </div>
            </article>
                                                <article class="article-list-1 clearfix">
                <header class="clearfix">
                    <h1 class="post-title"><a href="http://www.santostang.com/2018/07/04/hello-world/">Hello world!</a></h1>
                    <div class="post-meta">
                        <span class="meta-span"><i class="fa fa-calendar"></i> Jul 04</span>
                        <span class="meta-span"><i class="fa fa-folder-open-o"></i> <a href="http://www.santostang.com/category/python-%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" rel="category tag">Python Web Crawling</a></span>
                        <span class="meta-span"><i class="fa fa-commenting-o"></i> <a href="http://www.santostang.com/2018/07/04/hello-world/#comments">1 comment</a></span>
                        <span class="meta-span hidden-xs"><i class="fa fa-tags" aria-hidden="true"></i> </span>
                    </div>
                </header>
                <div class="post-content clearfix">
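                    <p>Welcome to WordPress. This is your first post. Edit or delete it, then start writing!

Dear readers: because the 网易云跟帖 (NetEase Cloud comment service) shut down after this book was published, Chapter 4 of the book can no longer be used. I have therefore switched the book's comment system to 来必力 (LiveRe)...</p>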
                </div>
            </article>
                                    <nav style="float:right">
                            </nav>
        </div>
        <div class="col-md-4 hidden-xs hidden-sm">
            <aside class="widget clearfix">
    <form id="searchform" action="http://www.santostang.com">
        <div class="input-group">
            <input type="search" class="form-control" placeholder="Search…" value="" name="s">
            <span class="input-group-btn"><button class="btn btn-default" type="submit"><i class="fa fa-search" aria-hidden="true"></i></button></span>
        </div>
    </form>
</aside>
<aside class="widget clearfix">
    <h4 class="widget-title">Categories</h4>
    <ul class="widget-cat">
        	<li class="cat-item cat-item-1"><a href="http://www.santostang.com/category/uncategorized/" >Uncategorized</a> (1)
</li>
	<li class="cat-item cat-item-3"><a href="http://www.santostang.com/category/python-%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" >Python Web Crawling</a> (4)
</li>
    </ul>
</aside>
<aside class="widget clearfix">
    <h4 class="widget-title">Popular Posts</h4>
    <ul class="widget-hot">
        
<li><a href="http://www.santostang.com/2018/07/04/hello-world/" target="_blank" >Hello world!</a></li>
<li><a href="http://www.santostang.com/2018/07/11/%e3%80%8a%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab%ef%bc%9a%e4%bb%8e%e5%85%a5%e9%97%a8%e5%88%b0%e5%ae%9e%e8%b7%b5%e3%80%8b%e4%b8%80%e4%b9%a6%e5%8b%98%e8%af%af/" target="_blank" >Errata for 《网络爬虫：从入门到实践》</a></li>
<li><a href="http://www.santostang.com/2018/07/14/%e7%ac%ac%e5%9b%9b%e7%ab%a0%ef%bc%9a%e5%8a%a8%e6%80%81%e7%bd%91%e9%a1%b5%e6%8a%93%e5%8f%96-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80-selenium/" target="_blank" >Chapter 4: Dynamic Web Page Scraping (Resolving the Real Address + Selenium)</a></li>
<li><a href="http://www.santostang.com/2018/07/14/4-2-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80%e6%8a%93%e5%8f%96/" target="_blank" >4.2 Scraping by Resolving the Real Address</a></li>
<li><a href="http://www.santostang.com/2018/07/15/4-3-%e9%80%9a%e8%bf%87selenium-%e6%a8%a1%e6%8b%9f%e6%b5%8f%e8%a7%88%e5%99%a8%e6%8a%93%e5%8f%96/" target="_blank" >4.3 Scraping by Simulating a Browser with Selenium</a></li>    </ul>
</aside>
<aside class="widget clearfix">
    <h4 class="widget-title">Random Picks</h4>
    <ul class="widget-hot">
            <li><a href="http://www.santostang.com/2018/07/04/hello-world/" title="Hello world!">Hello world!</a></li>
            <li><a href="http://www.santostang.com/2018/07/14/4-2-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80%e6%8a%93%e5%8f%96/" title="4.2 Scraping by Resolving the Real Address">4.2 Scraping by Resolving the Real Address</a></li>
            <li><a href="http://www.santostang.com/2018/07/14/%e7%ac%ac%e5%9b%9b%e7%ab%a0%ef%bc%9a%e5%8a%a8%e6%80%81%e7%bd%91%e9%a1%b5%e6%8a%93%e5%8f%96-%e8%a7%a3%e6%9e%90%e7%9c%9f%e5%ae%9e%e5%9c%b0%e5%9d%80-selenium/" title="Chapter 4: Dynamic Web Page Scraping (Resolving the Real Address + Selenium)">Chapter 4: Dynamic Web Page Scraping (Resolving the Real Address + Selenium)</a></li>
            <li><a href="http://www.santostang.com/2018/07/11/%e3%80%8a%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab%ef%bc%9a%e4%bb%8e%e5%85%a5%e9%97%a8%e5%88%b0%e5%ae%9e%e8%b7%b5%e3%80%8b%e4%b8%80%e4%b9%a6%e5%8b%98%e8%af%af/" title="Errata for 《网络爬虫：从入门到实践》">Errata for 《网络爬虫：从入门到实践》</a></li>
            <li><a href="http://www.santostang.com/2018/07/15/4-3-%e9%80%9a%e8%bf%87selenium-%e6%a8%a1%e6%8b%9f%e6%b5%8f%e8%a7%88%e5%99%a8%e6%8a%93%e5%8f%96/" title="4.3 Scraping by Simulating a Browser with Selenium">4.3 Scraping by Simulating a Browser with Selenium</a></li>
        </ul>
</aside>
<aside class="widget clearfix">
    <h4 class="widget-title">Tag Cloud</h4>
    <div class="widget-tags">
        <a href="http://www.santostang.com/tag/ajax/" class="tag-cloud-link tag-link-7 tag-link-position-1" style="color:#1901a1;font-size: 22pt;" aria-label="ajax (2 items)">ajax</a>
<a href="http://www.santostang.com/tag/javascript/" class="tag-cloud-link tag-link-8 tag-link-position-2" style="color:#7b7c7f;font-size: 8pt;" aria-label="javascript (1 item)">javascript</a>
<a href="http://www.santostang.com/tag/python/" class="tag-cloud-link tag-link-5 tag-link-position-3" style="color:#d2bb2c;font-size: 22pt;" aria-label="python (2 items)">python</a>
<a href="http://www.santostang.com/tag/sselenium/" class="tag-cloud-link tag-link-4 tag-link-position-4" style="color:#22824;font-size: 8pt;" aria-label="selenium (1 item)">selenium</a>
<a href="http://www.santostang.com/tag/%e7%bd%91%e7%bb%9c%e7%88%ac%e8%99%ab/" class="tag-cloud-link tag-link-6 tag-link-position-5" style="color:#1f47;font-size: 22pt;" aria-label="web crawler (2 items)">web crawler</a>
<a href="http://www.santostang.com/tag/%e7%bd%91%e9%a1%b5%e7%88%ac%e8%99%ab/" class="tag-cloud-link tag-link-9 tag-link-position-6" style="color:#755cc3;font-size: 8pt;" aria-label="web page crawler (1 item)">web page crawler</a>
<a href="http://www.santostang.com/tag/%e8%a7%a3%e6%9e%90%e5%9c%b0%e5%9d%80/" class="tag-cloud-link tag-link-10 tag-link-position-7" style="color:#fd7f20;font-size: 8pt;" aria-label="address resolution (1 item)">address resolution</a>    </div>
</aside>
<aside class="widget clearfix">
    <h4 class="widget-title">Links</h4>
    <ul class="widget-links">
            </ul>
</aside>
        </div>
    </div>
</div>

<div class="footer_search visible-xs visible-sm">
    <form id="searchform" action="http://www.santostang.com">
        <div class="input-group">
            <input type="search" class="form-control" placeholder="Search…" value="" name="s">
            <span class="input-group-btn"><button class="btn btn-default" type="submit"><i class="fa fa-search" aria-hidden="true"></i></button></span>
        </div>
    </form>
</div>
<footer id="footer">
    <div class="copyright">
        <p><i class="fa fa-copyright" aria-hidden="true"></i> 2018 <b>唐松 Santos</b></p>
        <p>Powered by <b>WordPress</b>. Theme by <a href="https://tangjie.me/jiestyle-two" data-toggle="tooltip" data-placement="top" title="WordPress theme template" target="_blank"><b>JieStyle Two</b></a></p>
    </div>
</footer>
<script type="text/javascript" src="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/js/skel.min.js"></script>
<script type="text/javascript" src="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/js/util.min.js"></script>
<script type="text/javascript" src="http://www.santostang.com/wp-content/themes/JieStyle-Two-master/js/nav.js"></script>
<script type='text/javascript' src='http://www.santostang.com/wp-includes/js/jquery/jquery.js?ver=1.12.4'></script>
<script type='text/javascript' src='http://www.santostang.com/wp-includes/js/jquery/jquery-migrate.min.js?ver=1.4.1'></script>
<script type='text/javascript' src='http://www.santostang.com/wp-content/plugins/captcha-bank/assets/global/plugins/custom/js/front-end-script.js?ver=4.9.9'></script>
<script>
$(function() {
    $('[data-toggle="tooltip"]').tooltip()
});
</script>
<script>
(function(){
    var bp = document.createElement('script');
    var curProtocol = window.location.protocol.split(':')[0];
    if (curProtocol === 'https') {
        bp.src = 'https://zz.bdstatic.com/linksubmit/push.js';
    }
    else {
        bp.src = 'http://push.zhanzhang.baidu.com/push.js';
    }
    var s = document.getElementsByTagName("script")[0];
    s.parentNode.insertBefore(bp, s);
})();
</script>
</body>
</html>